Abstract

This paper describes a large-scale, language-independent evaluation of the use of thematic hierarchies in
natural language generation. We translate from a corpus of sentences reflecting the full variety of behavior
of Levin-based verb classes. The corpus is used as input to a generation system that utilizes the same
thematic hierarchy for realizing relative argument surface positions in two languages: English and Spanish.
The output was manually evaluated by English and Spanish speakers. The contributions of this work
include: (1) an improved thematic hierarchy over an earlier implementation; (2) a large-scale evaluation of
the use of thematic hierarchies in two languages; (3) an implementation of a language-independent module
for natural language generation; and (4) the creation of a single tool for incremental development of
multilingual lexicons.
1 Motivation

In (Dorr et al., 1998), an implementation of thematic hierarchies for efficient natural language
generation was presented. The use of the thematic hierarchy was evaluated using a small
hand-constructed corpus of 100 English sentences reflecting a variety of English verb classes and
alternations. The hierarchy was implemented using cascading rules within the grammar formalism
provided as part of the natural language realization engine Nitrogen (Langkilde and Knight, 1998a;
Langkilde and Knight, 1998b). Some of the shortcomings of this earlier work are: (1) inadequate
evaluation due to the use of a small test corpus; (2) limitation of the approach to one language only
(English); (3) lack of a principled design in the implementation.

This paper presents a more systematic implementation of thematic hierarchies and a large-scale
evaluation of their use for generation in English and Spanish. This evaluation was helpful in the
incremental development of both the thematic hierarchy and the English and Spanish lexicons.

2 Research Context

The work presented here is part of the generation component (Traum and Habash, 2000) of the
interlingual Machine Translation effort at the University of Maryland College Park. The generation
component has also been used in Cross-Language Information Retrieval research (Levow et al., 2000).
The interlingual representation used is Lexical Conceptual Structure (LCS), a compositional
abstraction with language-independent properties that transcend structural idiosyncrasies
(Jackendoff, 1983; Jackendoff, 1990; Jackendoff, 1996). This representation has been used as the
interlingua of several projects such as UNITRAN (Dorr et al., 1993) and MILT (Dorr, 1997).

3 Overview of Generation in LCS-based Machine Translation

One of the major challenges in natural language processing is the ability to make use of existing
resources. Large differences in the syntax, semantics, and ontologies of such resources create
significant barriers to their usage in large-scale applications. A case in point is the wide range of
"interlingual representations" used in machine translation and cross-language processing. Such
representations are becoming increasingly prevalent, yet views vary widely as to what they should be
composed of, ranging from purely conceptual knowledge representations, having little to do with the
structure of language, to very syntactic representations, maintaining most of the idiosyncrasies of
the source languages. In our generation system we make use of resources associated with two different
(kinds of) interlingua structures: Lexical Conceptual Structure (LCS), and the Abstract Meaning
Representations (AMR) used at USC/ISI (Langkilde and Knight, 1998a). The two representations serve
different but complementary roles in the translation process. The deeper lexical-semantic
expressiveness of LCS is essential for language-independent Lexical Selection that transcends
translation divergences. The shallower yet mixed semantic-syntactic nature of AMRs makes them easier
to use for target-language realization.

Figure 1: LCS-based Machine Translation

The use of two representations in generation mirrors the use of two representations on the analysis
side of the MT system, in which a parsing output is passed to a semantic-composition module; the
target-language AMR is analogous to the source-language parse tree. (See Figure 1.) The Composition
module takes the source-language parse tree and creates a deeper semantic representation (the LCS)
using a source-language lexicon. In generation, the Decomposition module performs a reverse step that
uses a target-language lexicon to create the parse-like AMR. This step is referred to as Lexical
Selection. It is followed by the Realization step, in which the Linearization module flattens an AMR
into a sequence of words. Because of the ambiguity inherent in all of the involved modules, from the
parser to the lexicons, multiple sequences are created. We use the Statistical Extraction module of
the generation system Nitrogen (Langkilde and Knight, 1998a; Langkilde and Knight, 1998b) to select
among alternative outputs when generating English.

3.1 LCS Lexicons

The LCS lexicons used in both analysis and generation relate a lexeme to a Lexical Conceptual
Structure representation. A single verb might have several entries corresponding to different senses
of that verb. Figure 2 compares four of the nine root LCS (RLCS) entries for the verb 'run' in the
English LCS Lexicon. These entries are classified by their Levin verb class, which is used as a
template to generate the RLCSes for every verb in the class.

The star-marked nodes in those entries signify the locations where an argument can be attached. A
composed LCS (CLCS) is made up of an RLCS that has its star-marked nodes filled with other CLCSes. The
number at the end of each node marks the thematic role associated with that node. For example, 1 is
agent, 2 is theme, 3 is a source particle (i.e. an oblique) and 4 is source (an argument). For a full
listing of the thematic roles and their corresponding codes see Figure 3. The last LCS entry for run
in Figure 2 can be read as "a theme thing goes locationally from a source location to a goal location
in a running manner".

The current English verb lexicon contains over 11,000 RLCS entries such as those in Figure 2. These
entries correspond to different senses of more than 4,000 verbs. The Spanish verb lexicon contains
over 24,000 entries corresponding to 3,300 verbs.^1 The LCS lexicon also contains other information of
importance to realization, such as requirements for optionality (:OPTIONAL and :OBLIGATORY) and
internal/external positioning (:INT and :EXT). Optionality markers are necessary to determine which
arguments must be available in the CLCS for proper generation using an RLCS. For example, in class
51.3.2.a.i in Figure 2, the theme is the only obligatory argument. Internal/external positioning
markers will be discussed later in the section on the Thematic Hierarchy.
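
To make the notion of star-marked nodes and role codes concrete, the following is a minimal Python sketch, not the lexicon's actual storage format. It encodes the last 'run' entry of Figure 2 (class 51.3.2.a.i) as a small tree and lists the positions where arguments can attach. The Node class, the path helper and open_slots are hypothetical names introduced only for illustration, and the field labels (loc) are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One RLCS node (hypothetical encoding, not the lexicon's own format)."""
    head: str
    role: int = 0            # thematic role code from Figure 3; 0 = none assigned
    star: bool = False       # True if an argument can be attached at this node
    children: list = field(default_factory=list)


def open_slots(node):
    """Collect (head, role code) for every star-marked node, depth first."""
    found = [(node.head, node.role)] if node.star else []
    for child in node.children:
        found.extend(open_slots(child))
    return found


def path(prep, prep_role, place_role):
    """Build a starred path node such as ((* from 3) loc (thing 2) ([at] ...))."""
    return Node(prep, prep_role, star=True, children=[
        Node("thing", 2),
        Node("at", children=[Node("thing", 2), Node("thing", place_role)]),
    ])


# Class 51.3.2.a.i from Figure 2, with stars placed as printed in the figure.
RUN_51_3_2 = Node("go", children=[
    Node("thing", 2, star=True),   # theme
    path("from", 3, 4),            # source particle and source
    path("to", 5, 6),              # goal particle and goal
    Node("run+ingly", 26),         # conflated manner
])

slots = open_slots(RUN_51_3_2)
```

Under this encoding, slots holds the theme plus the two starred path particles, which are the positions a CLCS would fill with other CLCSes.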

3.2 Lexical Selection

The lexical selection process attempts to decompose a CLCS into RLCSes corresponding to lexemes in
the target language. Decomposition is essentially a graph matching/covering algorithm with
restrictions. Its output is the shallower Abstract Meaning Representation (AMR) discussed earlier.
Different lexicons for different languages provide different RLCSes and RLCS restrictions that guide
lexical selection. Figure 4 compares three different possible decompositions of a CLCS into English,
Spanish and Arabic. The CLCS can be read as "John causes himself to go into a room in a forceful
manner". The AMR relations (:AG, :TH, etc.) marking the connections on the left-hand side in Figure 4
are created from the thematic role information in the RLCSes.

3.3 Realization

Syntactic realization is the step that converts the unordered dependency tree structure of an AMR
into a surface sentence. There are two operations involved in realization: recasting and
linearization. Recasting converts an AMR node into another AMR node with added, deleted or modified
information. Linearization specifies the relative positions of the children of an AMR node to their
mother and to each other. The focus of this paper is on the specific linearization submodule that
deals with the problem of mapping thematic roles to surface positions.

^1 For a detailed discussion of the acquisition of LCS-based lexicons, see (Dorr and Olsen, 1996;
Dorr, 1997).

Figure 2: RLCS entries for 'run'

26.3 Verbs of Preparing

    (cause (* thing 1)
      (go ident (* thing 2)
        (toward ident (thing 2)
          (at ident (thing 2) (run+ed 9))))
      ((* for 17) poss (*head*) (* thing 18)))

    Example: John ran the store for Mary.
    Other verbs: bake boil clean cook fix fry grill iron mix prepare roast roll run wash ...

47.7.a Meander Verbs (from to)

    (go_ext loc (* thing 2)
      ((* from 3) loc (thing 2)
        (at loc (thing 2) (thing 4)))
      ((* to 5) loc (thing 2)
        (at loc (thing 2) (thing 6)))
      (run+ingly 26))

    Example: The river runs from the lake to the sea.
    Other verbs: crawl drop go meander plunge run sweep turn twist

47.5.1.b Swarm Verbs (Locational)

    (act loc (* thing 2)
      ((* [at] 10) loc (thing 2) (thing 11))
      (run+ingly 26))

    Example: The dogs run in the forest.
    Other verbs: bustle crawl creep run swarm swim teem ...

51.3.2.a.i Run Verbs - (Locational, Theme only)

    (go loc (* thing 2)
      ((* from 3) loc (thing 2)
        ([at] loc (thing 2) (thing 4)))
      ((* to 5) loc (thing 2)
        ([at] loc (thing 2) (thing 6)))
      (run+ingly 26))

    Example: The horse ran into the field from the barn.
    Other verbs: climb crawl fly jog jump leap race run swim walk ...

3.4 Oxygen

In (Dorr et al., 1998), the grammar formalism provided as part of the natural language realization
engine Nitrogen (Langkilde and Knight, 1998a; Langkilde and Knight, 1998b) was used to implement a
linearization grammar. The Nitrogen grammar formalism is unification-based and provides a small
number of tools to recast and linearize AMRs. There are several limitations to the use of this
formalism. For example, the grammar is interpreted, which results in inefficient time/space use.
Another limitation is that the tools provided are rather simple transformations, which causes the
linearization grammars to be long and complex. Currently we are using a different linearization
engine, Oxygen (Habash, 2000). Oxygen is an efficient language-independent linearization engine.
Linearization grammars for Oxygen are written using oxyL, a powerful linearization grammar
description language that has the power of a programming language with a focus on natural language
linearization. oxyL grammars are compiled into programs that run independently.

Figure 3: Inventory of Thematic Roles

    #    Thematic Role     Definition
    0                      no thematic role assigned
    1    AG                agent
    2    TH, EXP, INFO     theme or experiencer or information
    3    SRC()             source preposition
    4    SRC               source
    5    GOAL(), PRED()    goal or pred preposition
    6    GOAL              goal
    7    PERC()            perceived item particle
    8    PERC              perceived item
    9    PRED              identificational predicate
    10   LOC()             locational particle
    11   LOC               locational predicate
    12   POSS              possessional predicate
    13   TIME()            temporal particle preceding time
    14   TIME              time for TEMP field
    15   MOD-POSS()        possessional particle
    16   MOD-POSS          possessed item modifier
    17   BEN()             beneficiary particle
    18   BEN               benefactive modifier
    19   INSTR()           instrumental particle
    20   INSTR             instrument modifier
    21   PURP()            purpose particle
    22   PURP              purpose modifier or reason
    23   MOD-LOC()         location particle
    24   MOD-LOC           location modifier
    25   MANNER()          manner
    26                     reserved for conflated manner
    27   PROP              event or state
    28   MOD-PROP          event or state
    29   MOD-PRED()        identificational particle
    30   MOD-PRED          property modifier
    31   MOD-TIME          time modifier

The power of oxyL is accomplished by providing recasting mechanisms for the most common needs of a
linearization grammar and also by allowing embedding of code in a standard programming language
(Lisp). The oxyL linearization grammars are also simple, clear, concise and easily extendible. The
simplicity of oxyL grammars is apparent when one considers issues of redundancy; the handling of
ambiguities at every phrase rule is hidden from the linearization grammar designer and is treated
only in the compiler and support library. For a detailed presentation of oxyL's syntax, see (Habash,
2000). An example of a segment of an oxyL linearization grammar is provided in Figure 5 and will be
explained in the next section.

Figure 4: Different CLCS Decompositions
[Tree diagrams not recoverable from the source. Surviving labels and output sentences:]

    English   [break]    John, :GOAL-PART (into), :GOAL (room)    John broke into the room
    Spanish   [forzar]                                            John forzo la entrada en el cuarto
    Arabic    [iqtaHam]  John, :GOAL                              iqtaHama John algorfata

4 The Thematic Hierarchy

The unordered nature of siblings under an AMR node complicates the mapping between AMR relations and
their surface positions. In the case of thematic role ordering, the situation is further complicated
by the lack of a one-to-one mapping between a particular thematic role and an argument position. For
example, a theme can be the subject in some cases, the object in others, or even an oblique. Observe
cookie in (1).

(1) (i) John ate a cookie (object)
    (ii) the cookie contains chocolate (subject)
    (iii) she nibbled at a cookie (oblique)

To solve this problem, a thematic hierarchy is used to determine the argument position of a thematic
role based on its co-occurrence with other thematic roles. Several researchers have proposed
different versions of thematic hierarchies (see (Jackendoff, 1972; Carrier-Duncan, 1985; Bresnan and
Kanerva, 1989; Kiparsky, 1985; Larson, 1988; Giorgi, 1984; Wilkins, 1988; Nishigauchi, 1984; Alsina
and Mchombo, 1993; Baker, 1989; Grimshaw and Mester, 1988)).^2 The hierarchy proposed in (Dorr et al.,
1998) differs from these in that it separates (non-adjunct) arguments from obliques (i.e. adjunct
arguments) and provides a more complete list of thematic roles (31 roles overall) than those of
previous approaches (maximum of 8 roles). See Figure 3 for a complete listing of the thematic roles
used. The following is the final thematic hierarchy for arguments:

(2) special case: ag src th
    ext > ag > instr > th > perc > *

When a theme occurs alone, it is mapped to the first argument position. If a theme and an agent
occur, the agent is mapped to the first argument position and the theme is mapped to the second
argument position. When an agent, a theme and a source co-occur, the order in the hierarchy is
violated, as in John_ag charged Paul_src $40_th. The term ext is used to handle verbs that violate
the thematic hierarchy. It refers to an externally marked thematic role, such as the perceived John
in John_perc pleases Mary_th versus Mary_th likes John_perc. This information is provided in the RLCS
lexicon entry using the special marker :EXT. The use of the thematic hierarchy eliminates the need to
specify the mapping from thematic role to surface position in every verb lexicon entry.

As for the ordering of obliques, an ad hoc order was established:

(3) particle > mod-prop() > perc() > th() > purp() > mod-loc() > mod-pred() >
    src() > goal() > mod-poss() > ben()

Note that the order of obliques is not a strict hierarchy but rather a possible topological sort. A
more detailed discussion is available in (Dorr et al., 1998).
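
As a small illustration of order (3), the sketch below sorts a set of obliques by their rank in that list. It is only a reading aid under simple assumptions: obliques are modelled as hypothetical (role, phrase) pairs, and the function name order_obliques is invented for this example. It does not describe how any particular linearization grammar places obliques.

```python
# Order (3), copied from the text; earlier entries are realized first.
OBLIQUE_ORDER = [
    "particle", "mod-prop()", "perc()", "th()", "purp()", "mod-loc()",
    "mod-pred()", "src()", "goal()", "mod-poss()", "ben()",
]
RANK = {role: position for position, role in enumerate(OBLIQUE_ORDER)}


def order_obliques(obliques):
    """Sort (role, phrase) pairs by the fixed oblique order in (3)."""
    return sorted(obliques, key=lambda pair: RANK[pair[0]])


ordered = order_obliques([
    ("ben()", "for Mary"),
    ("goal()", "into the field"),
    ("src()", "from the barn"),
])
```

Because the list is one possible topological sort rather than a strict hierarchy, other rankings consistent with the same pairwise constraints would be equally valid inputs to this function.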

4.1 Thematic Hierarchy Implementation

Oxygen's linearization grammar description language, oxyL, provides a hierarchical data recasting
operator that simplifies the implementation of thematic hierarchy mapping^3 (see Figure 5). The top
part of Figure 5 defines the thematic hierarchy ordering as follows: given the current node (@this),
conditionally recast (<?) the relation :src into :movsrc if it co-occurs with :ag and :th; then
hierarchically recast (<!) all available argument thematic roles into the grammatical roles :subj,
:obj1 and :obj2. The Linearization rule %S specifies the relative position of the arguments to the
verb (@inst). The separation between Recasting and Linearization breaks up the problem of mapping a
thematic role to a surface position into two sub-problems: mapping a thematic role into a grammatical
role (subject, object) and mapping a grammatical role into a surface position. The recasting and
linearization rules are only fired if the AMR node being linearized is a verb. The linearization rule
in our implementation also specifies the relative location of the obliques: they are permuted at the
end of the sentence.

^2 For an excellent overview and a comparison of different thematic hierarchies see (Levin and
Rappaport Hovav, 1996).

^3 Another example of hierarchically ordered linguistic phenomena is the linearization of
auxiliaries relative to the negative particle in the English verb phrase. The auxiliaries are
strictly ordered by part of speech (Modal Have Be+en Be+ing). The negative particle 'not' must
appear after the first auxiliary regardless of its part of speech. A hierarchical mapping of the
auxiliaries into (Aux1 Aux2 Aux3 and Aux4) is a simpler solution than listing all combinations.

Figure 5: oxyL implementation of the thematic hierarchy

    :Recast &TH-order
    (@this <?
           (:movsrc / (:src))
           (&and (&ex :ag) (&ex :th))
           <!
           ((:subj :obj1 :obj2) /
            (:ext :sub :ag :instr :movsrc :th
             :src :perc :goal :mod-poss
             :mod-loc :mod-pred :loc :poss
             :pred :prop :time :ben :purp)))

    :Rule %S
    (->(@subj @inst @obj1 @obj2))
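
The two-step procedure described for Figure 5 can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not the oxyL compiler's behavior: an AMR verb node is modelled as a plain dict from relation name to phrase, with the verb under the key "inst"; the names recast_th_order and linearize are hypothetical; and roles left over after the three grammatical slots are filled are simply left untouched, since the text does not say how they are treated.

```python
# Role ranking copied from Figure 5 (relation names written without the colon).
HIERARCHY = [
    "ext", "sub", "ag", "instr", "movsrc", "th", "src", "perc", "goal",
    "mod-poss", "mod-loc", "mod-pred", "loc", "poss", "pred", "prop",
    "time", "ben", "purp",
]
GRAMMATICAL_ROLES = ["subj", "obj1", "obj2"]


def recast_th_order(node):
    """Map thematic roles to grammatical roles on a copy of the node."""
    node = dict(node)
    # Conditional recast (<?): src becomes movsrc only next to ag and th.
    if "src" in node and "ag" in node and "th" in node:
        node["movsrc"] = node.pop("src")
    # Hierarchical recast (<!): the highest-ranked roles that are present
    # take subj, obj1 and obj2, in that order.
    present = [role for role in HIERARCHY if role in node]
    for grammatical, thematic in zip(GRAMMATICAL_ROLES, present):
        node[grammatical] = node.pop(thematic)
    return node


def linearize(node):
    """Rule %S: subject, verb, first object, second object."""
    order = ["subj", "inst", "obj1", "obj2"]
    return " ".join(node[key] for key in order if key in node)


charged = linearize(recast_th_order(
    {"inst": "charged", "ag": "John", "th": "$40", "src": "Paul"}))
ate = linearize(recast_th_order({"inst": "ate", "ag": "John", "th": "a cookie"}))
ran = linearize(recast_th_order({"inst": "ran", "th": "the horse"}))
```

Keeping the recast and the linearization in separate functions mirrors the split into two sub-problems: only linearize would need to change for a language with a different mapping from grammatical roles to surface positions.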

4.2 Incorporation of Spanish

The linearization component that includes the thematic hierarchy mapping was implemented using
Oxygen (Habash, 2000). The linearization grammar was very simple, concentrating on argument word
order relative to the verb and using the same thematic hierarchy described in Figure 5.

To incorporate Spanish in our current implementation, we replaced complex Spanish morphology with
the simple 'near-future' construction (ir a + INF). For example, alguien_ag va a colocar algo_th en
algo_goal. In addition to lacking a complete phrase structure for parts of speech other than verbs,
the Spanish linearization grammar does not handle pro-drop or clitics. In principle, both phenomena
can be handled with a recast rule that would fire after the thematic hierarchy recast. In the case of
pro-drop, the rule would conjugate the verb and make the subject null. In the case of clitics, it
would add a clitic that matches the gender and number of the object.


    Verb Class   Example
    2            something_ag wanted something_th (to do something_th)_prop
    10.5         someone_ag stole something_th from something_src for something_ben
    22.1.C       someone_ag mixed something_th into something_goal
    29.1.B       someone_th considered something_perc (to be someproperty_pred)_mod-pred
    45.2.A       someone_ag folded something_th with something_inst
    55.1.C       someone_th continued (to do something_th)_prop

Table 1: CLCS Test Corpus Examples

In the next section we evaluate the use of the thematic hierarchy for English and Spanish
generation. The fact that English and Spanish are both SVO languages does not lessen the validity of
the evaluation, since the role of thematic hierarchies is not to map the thematic roles to surface
positions but rather to the syntactic level (i.e. agent, theme, goal to grammatical roles such as
subject, object and indirect object). Final linearization is responsible for placing the subject and
object appropriately on the surface. The similarity of surface word order between Spanish and
English should be seen as a normalization factor in testing the mapping from thematic roles to
grammatical roles.

5 Evaluation

In this evaluation, a test corpus of 453 simple CLCSes corresponding to all Levin English verb
classes and alternations was constructed semi-automatically. The test corpus size guarantees
large-scale coverage of verb behavior and thematic role combinations, which is exhaustive for our
purpose. The CLCSes were constructed by randomly selecting an LCS verb entry from each class in the
English verb lexicon and filling all its argument positions with simple noun phrases (e.g.
something_th, someone_ag, etc.) or simple subordinate clauses (e.g. (to do something)_prop, (to be
someproperty)_mod-prop, etc.). Table 1 shows some sample English sentences corresponding to the
CLCSes in the test corpus.

For the purposes of this evaluation, statistical extraction was disabled because we do not have a
Nitrogen bigram model for Spanish.

The CLCS test corpus was fed to the generation system in two different runs, each using a different
target-language lexicon and oxyL linearization grammar. The results of the generation were passed to
two speakers of English and Spanish, respectively, to evaluate the word order of the realized text.
Evaluators were asked to mark sentences as being acceptable or not acceptable with respect to the
word order of the arguments relative to the verb.^4 Some of the English and Spanish sentences failed
the lexical selection process due to problems with lexicon entries; these sentences never made it to
linearization.

                 Generated Classes   Word Order Error
    English      428                 9% (40 classes)
    Spanish      254                 2% (4 classes)

Table 2: Initial Evaluation Results

In the cases that survived, the lexical selection process appropriately generated multiple sentences
for each CLCS. In the case of English, they all correctly corresponded to various related
alternations of the main verb. For example, each of the two subclasses defining the dative
alternations for the verb send generated each other (i.e. John sent a book to Paul and John sent Paul
a book). There were also cases of overgeneration resulting from preposition under-specification,
which is inconsequential to our evaluation (e.g. go (to, toward, towards, to at, etc.) somewhere).

On the other hand, in Spanish, there were many more sentences that should not have been generated.
In theory, the lexical selection process limits the number of choices using the LCS entry of the
Spanish verbs. But that process is only as good as the lexicon entries are. In cases where a bad
sense is allowed in the translation, the sentence involved is dropped from the evaluation. This
evaluation was quite helpful in pinpointing the locations of problems in our Spanish (and also
English) lexicons. Table 2 displays the results of the evaluation. The first column represents the
number of generated classes or CLCS instances (out of N = 453) that actually went through the whole
system. Most failures in Spanish generation are due to missing verb entries (29% of all input
classes). An additional 5% of classes were dropped from the evaluation for having no correct sense
output. The second column gives the ratio of classes with partially wrong or fully wrong word order
output to the number of generated classes. In English, out of 428 classes, 30 classes had partially
wrong output and 10 classes had no correct output. In Spanish, out of the 254 classes that generated
output, only four classes had wrong word order output.
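
The percentages in Table 2 follow directly from the class counts given above. The short Python check below recomputes them; the coverage figures in the last two lines are derived here from the reported counts and are not stated in the text, and the helper name percent is an illustrative choice.

```python
TOTAL_CLASSES = 453  # size of the CLCS test corpus


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Word order error: classes with partially or fully wrong output,
# relative to the classes that generated any output.
english_error = percent(30 + 10, 428)
spanish_error = percent(4, 254)

# Derived (not reported in the text): share of input classes that generated.
english_coverage = percent(428, TOTAL_CLASSES)
spanish_coverage = percent(254, TOTAL_CLASSES)
```

The two error values reproduce the 9% and 2% entries of Table 2.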

The next section describes the errors encountered in the evaluation and how they were fixed.

^4 Actually, the evaluation contained several other criteria that are more relevant to evaluating
lexical selection, such as completeness of argument realization and appropriateness of sense
selection.


Figure 6: New oxyL implementation of the English thematic hierarchy

    :Recast &TH-order
    (@this <?
           (:mov / (:src :goal :ben))
           (&and (&ex :ag) (&ex :th))
           <!
           ((:subj :obj1 :obj2) /
            (:ext :sub :ag :instr :mov :th
             :src :perc :goal :mod-poss
             :mod-loc :mod-pred :loc :poss
             :pred :prop :time :ben :purp)))

    :Rule %S
    (->(@subj @inst @obj1 @obj2))

6 Discussion

The word-order errors in the English test belong to one of two types. First, there are lexicon
errors where specific realization information such as :EXT is missing from an entry. This problem
appeared in three subclasses of class 41.3.1 (Simple Verbs of Dressing: don, doff and wear). In our
lexicon, clothes, the object for all three verbs, is considered the theme, and the subject of the
sentence is the goal, source and location respectively. Fixing these cases is a matter of adding the
appropriate piece of information to the lexicon. The second type consists of true thematic hierarchy
errors: the cases of agent-benefactor-theme co-occurrence, such as John bought Paul a house, and
agent-goal-theme co-occurrence, such as John gave Paul a house. These two should be part of the
special case of the thematic hierarchy that deals with the indirect objects of English verbs. Figure
6 displays the updated thematic hierarchy for English. In this implementation, a temporary role :MOV
is created to mark source, goal or benefactor as moved arguments in a special conditional recasting
step that depends on the co-occurrence of any of these roles with an agent and a theme.

The Spanish errors are far fewer than the English ones and are basically a subset of the first type
of errors described above. The fact that the special case of the thematic hierarchy for English was
included and did not cause any problem for Spanish is not surprising, since Spanish lexical selection
does not allow the thematic roles agent and theme to co-occur with the arguments source, goal or
benefactor. The third argument is always generated as an oblique, for example, Juan le dio un libro a
Paolo and Juan le compro un libro a Paolo.^5 The updated thematic hierarchy for Spanish is described
in Figure 7.

^5 We are aware that a more fluent Spanish would move the oblique (a Paolo) closer to the verb, as in
Juan le dio a Paolo un libro and Juan le compro a Paolo un libro. However, this is not part of the
focus of our evaluation. The behavior of obliques is something we plan to investigate in a separate
study.

Figure 7: New oxyL implementation of the Spanish thematic hierarchy

    :Recast &TH-order
    (@this <! ((:subj :obj1 :obj2) /
               (:ext :sub :ag :instr :th
                :src :perc :goal :mod-poss
                :mod-loc :mod-pred :loc :poss
                :pred :prop :time :ben :purp)))

    :Rule %S (->(@subj @inst @obj1 @obj2))

                 Generated Classes   Word Order Error
    English      428                 1% (5 classes)
    Spanish      254                 2% (4 classes)

Table 3: Final Evaluation Results

We ran a second evaluation that uses the new implementations. The results are presented in Table 3.
For English, all of the classes with partially correct word order and half of the classes with
incorrect word order were corrected (88% of all erroneous classes). In the Spanish case, as expected,
the results did not change.

The results show that the use of a thematic hierarchy for generating both English and Spanish word
order is successful, which supports earlier work (Dorr et al., 1998). The next step in this ongoing
investigation is to test the use of the thematic hierarchy with a language whose mapping from
grammatical roles to surface positions differs from that of English or Spanish.

7 Future Work

A major remaining step is to correct the problems in the English and Spanish lexicons and to
investigate the source of errors and incorrect sense selection. An investigation into the behavior of
obliques in Spanish is necessary to produce fully fluent Spanish output. Another topic of interest is
the reusability of the thematic hierarchy with other languages that differ from English far more than
Spanish does. We are currently investigating Chinese; a preliminary study showed some promising
results as far as thematic hierarchy mapping is concerned. However, Chinese seems to require more
complex linearization rules and post-lexical-selection manipulations, especially for obliques.